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We use a path integral representation to solve the Eigen and Crow-Kimura molecular evolution 
models for the case of multiple fitness peaks with arbitrary fitness and degradation functions. In 
the general case, we find that the solution to these molecular evolution models can be written as the 
optimum of a fitness function, with constraints enforced by Lagrange multipliers and with a term 
accounting for the entropy of the spreading population in sequence space. The results for the Eigen 
model are applied to consider virus or cancer proliferation under the control of drugs or the immune 
system. 
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I. INTRODUCTION 



It is widely accepted that biological evolution proceeds 
andscape 

several exact results U LH have been derived for the 



on a rugged fitness landscape 



ilogicf 



In the last decade, 



molecular evolution models of Eigen |(| Q and Crow- 
Kimura @ and their generalizations 0- 

In this paper, we consider a multiple-peak replication 
rate landscape as a means to model a rugged fitness 
landscape. To date, there are few rigorous results for 
multiple-peak fitness landscapes. Such results begin to 
make the connection with the biologically-relevant case of 
a rugged fitness landscape. We here derive error thresh- 
olds by means of a path integral representation for both 
the Eigen and Crow-Kimura mutation-selection schemes 
with an arbitrary number of replication rate peaks. 

We first generalize the Crow-Kimura model to the 
multiple-peak case. The solution of the single-peak ver- 
sion of this model, where the replication rate is a function 
of the Hamming distance from one configuration, was 
provided by a path integral representation in [4], |j, ll(l | . 
We here provide the solution to this model for K peaks, 
where the replication rate is a function of the Hamming 
distances from K configurations, again by means of a 
path integral. We find that the mean distances from 
the peaks maximize the replication rate, with constraints 
provided by Lagrange multipliers, and with an additional 
term that represents the entropy of the population in se- 
quence space. Explicit solutions to this maximization 
task are given for the two-peak case. 

We then generalize and solve the continuous-time 
Eigen model for K peaks, where the replication and 
degradation rates are functions of the Hamming dis- 
tances from K configurations, A solution of the discrete- 
time, single-peak Eigen model, which in a sense inter- 
polates between the Crow-Kimura and continuous-time 
Eigen model was provided in We here solve 

the continuous- time Eigen model for K peaks, again by 
means of a path integral representation. The mean dis- 
tances from the peaks maximize an excess replication rate 
with an effective mutation rate, with constraints pro- 



vided by Lagrange multipliers, and with an additional 
term that represents the entropy of the population in se- 
quence space under the effective mutation rate. 

The Eigen model was first developed to study viral evo- 
lution 0] , and we use our solution of the two-peak Eigen 
model to consider viral propagation in the presence of 
either immune system suppression or an anti-viral drug. 
The preferred viral genome exists at one point in genome 
space. Conversely, the drug or immune system suppresses 
the virus most strongly at some other point in genome 
space. These two points in genome space are the two 
peaks of the model. The viral growth rate and the sup- 
pression rate both decrease with the Hamming distance 
away from these two unique points. 

The rest of the paper is organized as follows. In Section 
2 we describe the generalization of the Crow-Kimura, or 
parallel, model to multiple peaks. We provide a solution 
of this model for an arbitrary replication rate function 
that depends on distances from K peaks. In Section 3, 
we describe the Eigen model. We provide a solution for 
arbitrary replication and degradation rate functions that 
depend on distances from K peaks. In Section 4, we use 
the Eigen model theory to address the interaction of the 
immune system with a drug. We consider both adaptable 
drugs and the original antigenic sin phenomenon |b'{| . We 
also consider tumor suppression by the immune system. 
We discuss these results and conclude in section 5. We 
provide a derivation of the path integral representation 
of the continuous-time Eigen model in the Appendix. 



II. CROW-KIMURA MODEL WITH MULTIPLE 
PEAKS 

Here we first briefly introduce the Crow-Kimura model 
@ and its quantum spin version Q so that it is eas- 
ier to understand its generalizations to be studied in the 
present paper. In the Crow-Kimura model, any genotype 
configuration i is specified a sequence of N two-valued 
spins s„ = ±1, 1 < n < N. We denote such configura- 
tion i by Si = (s\, . . . , s l N ). That is, as in 7j, we consider 



2 



s n = +1 to represent the purines (A, G) and s n = — 1 to 
represent the pyrimidines (C, T). The difference between 
two configurations Si and Sj = (sj, . . . , s J N ) is described 
by the Hamming distance <2y = (N — yj n s n s n)/^> which 
is the number of different spins between Si and Sj . The 
relative frequency Pi of configurations with a given se- 
quence 1 < i < 2 N satisfies 



(1) 



Here is the the replication rate or the number of 
offspring per unit period of time, the fitness, and Hij 
is the mutation rate to move from sequence Si to se- 
quence Sj per unit period of time. In the Crow-Kimura 
model, only single base mutations are allowed: ptij — 
•yA(dij — 1) — N r yA(dij). Here A(n) is the Kronecker 
delta. 

The fitness of an organism with a given genotype is 
specified in the Crow-Kimura model by the choice of the 
replication rate function j-j, which is a function of the 
genotype: r$ = f{Si). It has been observed [j| that the 
system (QJ, with n = f(s\,...,s z N ) evolves according 
to a Schrodingcr equation in imaginary time with the 
Hamiltonian: 



A' 



ff = 7£«-i) + /K, 



N ) 



(2) 



Here a x and a z are the Pauli matrices. The mean 
replication rate, or fitness, of the equilibrium popula- 
tion of genotypes is calculated as lim^^ V '. p j(t)rj — 
lim^^oo 4lnZ = lim^—xx, 4 InTr exp(— (3H) p). In this 
way it is possible to find the phase structure and error 
threshold of the equilibrium population. In the general- 
ized setting, the Crow-Kimura model is often called the 
parallel model. 



A. The parallel model with two peaks 



We consider two peaks to be located at two config- 
urations v^,v^,l < n < N, where — ±l,i>^ = 
±1, and the two configurations have I common spins: 
J2n=i v n v n = 21 — N. The value of I determines how close 
the two peaks are in genotype space. Now the replica- 
tion rate of configuration S is a function of the Hamming 
distances to each peak: 



r i =f(2L 1 /N-l,2L 3 /N-l) , 



(3) 



where VJ„ =1 u£s„ = 2Li~N and J2 n =i v n s n = 2L 2 - N. 

Due to the symmetry of the Hamiltonian, the equi- 
librium frequencies are a function only of the distances 
from the two peaks: pi = p(Li,L 2 ). We define the fac- 
tors x ai , a2 that describe the fraction of spins a con- 
figuration S has in common with the spins of config- 
urations 1/ , v . In particular, we define the fraction 
of spins that are equal to ctk times the value in peak 



configuration v . For K peaks, the general definition 
is x ai ... aK = (l/N)J2 n=1 , N S[s n ,aivl l }...S[s n ,a K v^}. 
For the two peak case, these factors are related to the 
distances from the configuration to each peak and to the 
distance between the peaks: 

ar+_(£i,L 2 ) - (ii - L 2 + N - 1)/{2N) 

x++{L u L2) = {L 1 +L 2 -N + l)/(2N) 

x — (Li,L 2 ) = (-L 1 -L 2 + N + l)/(2N) 

x. + {L 1 ,L 2 ) = (-L 1 + L 2 +N-l)/(2N) . (4) 

With these factors x, we find the following equation for 
the total probability at a given value of L\ and L 2 , 
P{L X ,L 2 ): 

dP(L u L 2 ) _ 2L X 2L 2 

It — - /( lv" _ lj — ~ ^ P ^ L ^ 

—^NP{L\,L 2 ) 

+7 ^2 Nx aii0l2 (Li + a>i,L 2 + a 2 ) 

Ql- ±1,Q2 — =1=1 

xP(ii + ctx,L 2 + a 2 ) 

N or 1 or' 
-P{L U L 2 ) £ f(^-l,^-l)P(L[,L' 2 ). 



L' ,L'=0 



(5) 



Only the values of L\ and L 2 satisfying the conditions 
< Li < N, \L X +L 2 -N\ < I, \Li-L 2 \ < N - I are 
associated with non-zero probabilities. Equation JSJ) can 
be solved numerically to find the error threshold and av- 
erage distance of the population to the two peaks. In the 
next section we solve this equation, and its generalization 
to K peaks, analytically. 



B. Exact solution of the K peak case by a path 
integral representation 

We consider the case of K peaks. We consider the 
replication rate to depend only on the distances from 
each peak 



/( 



2L X 



1. 



2L 



K 



N 



l) = Nf ( Ul 



,u K ) 



(6) 



where Nu k = Y,n=i v n s n = 2i fe - N, 1 < k < K. The 
observable value \(uk) is called the surface magnetization 
[l4|. or surplus J3|, for peak k. 

Characterization of the fitness function that depends 
on K peaks through the K values of Uk requires more 
than the K(K — l)/2 Hamming distances between the 
peaks. It proves convenient to define the 2 K parameters 
Ua 1 ...a K = Vi>i < i < 2 . These are defined by j/j — 
(1/N) yj n=1 Ilfc=i $( a ik, v*). Here on is the set of indices 
ol\ . . . ax, and = Ok in a i-ih set of indices ol\ . . . ax- 
The introduction of the 2 K parameters yi is one principle 
point of this article. 
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The Trotter-Suzuki method has been applied in [^, [^} 
to convert the quantum partition function for a single 
peak model into a classical functional integral. While 
calculating Z = exp[— 0H], intermediate spin configura- 
tions are introduced. We find Z is a functional integral, 
with the integrand involving a partition function of a 
spin system in 2-d lattice. In the spin system, there is a 
nearest-neighbor interaction in horizontal direction and a 
mean-field like interaction in the vertical direction. This 
spin system partition function was evaluated in 0, 
under the assumption that the field values are constant. 
A path integral representation of the discrete time Eigen 
model, which is quite similar to the parallel model, was 
introduced by Peliti |T^ . 

We here generalize this procedure to K peaks and cal- 
culate the time-dependent path integral and Ising parti- 
tion function. Since the replication rate is a function of 
K distances, the functional integral is over K fields that 
represent the K magnetizations. The path integral form 
of the partition function is 



C. Explicit results for the two peak case 

For clarity we write the expression for the case of two 

peaks. In this case, y++ + y = (1 + m)/2 and y-\ + 

= (1 - m)/2, where m = (21 - N)/N. We solve 
Eq. (TJJl f° r the fields H k and put the result into Eq. JgJ . 
We find that for a pure phase, the bulk magnetizations 
maximize the function 

In Z 

— = /o(M l7 M 2 ) 



Z = 



VM k VH k expiN / d/3' 



- ^ H k (f3')M k (/?')- 7 + N J2 Vi l n Qi \ 

fc=l J 4=1 ' 



where 



Qx = TrfX df>> [ aW ~> +a * £f=i a *" H ^')] 



(8) 

Here /3 = t, is the large time to which Eq. is solved, 
and the operator T denotes time ordering [Tg| , discussed 
in the Appendix in the context of the Eigen model. Us- 
ing that N is large, we take the saddle point. Consider- 
ing S\nZ/8M k ((3') = and SlnZ/8H k {/3') = 0, we find 
M k (f3') and H k (f3') independent of /?' is a solution. At 
long time, therefore, the mean replication rate, or fitness, 
per site becomes 



In Z 

= max 

(3N M k ,Hk 



f (M u ...,M K ) 



K 

E 

k=l 



K 

7 2 + (E^ fci/fe ) 2 

k=l 



H k M k - 7 

1/2 



We take the saddle point in H k to find 



j 2 + (Z k > =1 ^H k ,y< 



(9) 



(10) 



We note that the observable, surface, magnetization 
given by (u k ) is not directly accessible in the saddle point 
limit, but is calculable from the mean replication rate 0. 
In the one peak case one defines the observable surface 
magnetization for a monotonic fitness function as follows 
fl5| : one solves the equation fo((u)) = (\nZ)/(J3N). For 
multiple peaks, we use this same trick, considering a sym- 
metric fitness function and assuming (ui) — (7/2) . . . — 
(u K )- 



+ ^V(l+m) 2 -(M 1 +M 2 )2 
+ 7^(1 " ™) 2 ~ (Mi - M 2 ) 2 



•7. (11) 



with the constraints 



/o [Mr (/?'),..., 



- 1 < Mi < 1; -1 < M 2 < 1 
-(1 + m) < Mi + M 2 < 1 + m 
-(1 - m) < Mi - M 2 < 1 - m . 



(12) 



(7) 



In the case of a quadratic replication rate, /o = k±Mf 
k 2 M% + k z M x M 2 ., Eq. JTTJ becomes 

^| = k x M\ + k 2 M% + k 3 M 1 M 2 

7 



+iv/(l + ™) 2 -(M + M 2 ) 2 

+ 2^/(1 _ m )2_ (Ml _ M2 )2_ 7) (13) 

with the constraints of Eq. I|12|) . 

As an example, we consider the replication rate func- 
tion f = k(Ml + Mf + M 1 M 2 )/2. When m > 0, and 
the two peaks are within a Hamming distance of N/2 of 
each other, there is a solution with Mi = M 2 = M for 
which 



3fcM 2 



1 + TO 



M 



1/2 



"2 



(«l)(«2)) 



(14) 



where the observable, surface, magnetization is given by 
(Mi) = (2Li/N — 1). We find 



Mi = M 2 = M = V(l + ™) 2 /4-7 2 /(9 fc2 ) ■ (15) 
We have for the mean replication rate, or fitness, per site 



In Z 3k ( 1 + to 7 
~ ~2 \ 3fc 



that 



(u) 



1 + TO 7 

~~2 3fc 



(16) 



(17) 



4 



m 


k 


(«1> 


(u 2 ) 


{^1 ) analytic 


{^2 ) analytic 


0.93 


3.0 


0.85 


0.85 


0.853 


0.853 


0.93 


2.0 


0.80 


0.80 


0.798 


0.798 


0.7 


3.0 


0.74 


0.74 


0.738 


0.738 


0.7 


2.0 


0.68 


0.68 


0.683 


0.683 


-0.7 


3.0 


0.52 


-0.52 


0.517 


-0.517 


-0.7 


2.0 


0.35 


-0.35 


0.35 


-0.35 


-0.93 


3.0 


0.63 


-0.63 


0.631 


-0.631 


-0.93 


2.0 


0.46 


-0.46 


0.465 


-0.465 



TABLE I: Comparison between the analytical formulas Eqs. 
1171 12 II for the two peak landscape in the parallel model 
and results from a direct numerical solution of the system of 
differential equations, Eq. for sequences of length N = 
1000, vtiihp(Li,L 2 ,t = 0) = 8(L U N)S(L 2 , 1). 



When m < 0, and the two peaks are greater than a Ham- 
ming distance of N/2 of each other, there is a solution 
with Mi = —M.2 — M for which 



kM 2 

k 

: ~ 27" 

One solution is 



.1 — m. 



2 - M 2 



1/2 



1 — m 



««0 



"2 



(«1)(W2» 



Mi = -M 2 = V(l -m) 2 /4-7 2 /fc 2 , 



(18) 



(19) 



which gives for a mean replication rate, or fitness, per 
site 



In Z k ( 1 — m 



so that 



{ui) = -(u 2 ) 



1 — m 



2 
k 



(20) 



(21) 



Numerical solution is in agreement with our analytical 
formulas, as shown in Table [I] 



q N-d(ij)(i _ q y{i,i) : with the distance between two 

genomes Si and Sj given by d(ij) = (iV-^" =1 4 s n)/ 2 - 
The parameter 7 = N(l — 5) describes the efficiency of 
mutations. We take 7 = 0(1). As in Eq. ©, we take 
the replication and degradation rate to depend only on 
the spin state, in particular on the distances from each 
peak: fj = /(<%) and D{ = D(Si) where 

f(S)=Nf (u u ...,u k ), D(S) = Nd Q ( Ul , . . . ,u fc ) • 

.(23) 

We find the path integral representation of the partition 
function for the Eigen model for the K peak case in the 
limit of long time as 



Z = 



■DM k T>H k T>m Q 'Dhoexp<.N J d/3' 
fo(M 1 ,...,M K )e-^ 1 - m "'> -h m Q 

K 



-J2H k M k -d (M 1 ,...,M K ) 

fc=i 

2 K 

+Nj2y l \nQ 1 \ , 

»=i > 



(24) 



where 

Qi = Trfe^ ti/3 '[' Ta: ' lo(/3 ' )+,TZ S^=i Q * fcHfc(/3 ' 



(25) 



The Mfc are the values of the magnetization, and jmo is 
an effective mutation rate. This form is derived in the 
Appendix. Using that N is large, we take the saddle 
point. As before, we find the mean excess replication 
rate per site, f m = lim t _ ) . 00 EiPi(*)( r i _ Di)/N, from 
the maximum of the expression for Z = Trexp(— [3H). 
We find Z ~ exp(/3Nf m ), where 



Jo (Mi,., 
-do (Mi, 



.,Afjf), (26) 



and where mo , M^ are defined through the fields : 



III. EIGEN MODEL WITH MULTIPLE PEAKS 

A. Exact solution by a path integral representation 

In the case of the Eigen model, the system is defined by 
means of replication rate functions, rj, as well as degra- 
dation rates, D f 



3=1 



3=1 



(22) 

Here the frequencies of a given genome, pi, satisfy 
^2i=iPi = !• The transition rates are given by Qij = 



E^=l a ik'Hk> 



M fe = yjCLik 



/if) 



^o + (Ef=i«^) 5 



Wc define 



TO; 



(EfcLi aikHk) 2 



(27) 



(28) 



We thus find m = X)i=i Wyl 



giving Eq. ■ 



5 



fc/7 


M 


(u) 


{^) analytic 


1.2 


0.24 


0.065 


0.068 


1.4 


0.31 


0.112 


0.113 


1.6 


0.35 


0.146 


0.147 


1.8 


0.38 


0.172 


0.172 


2.0 


0.41 


0.192 


0.193 



TABLE II: Comparison between the analytical formulas Eqs. 
130H for the quadratic landscape 129H in the Eigen model and 
results from a direct numerical solution of the system of dif- 
ferential equations 7], for sequences of length TV = 4000, with 
p(u, t = 0) = 5(u, 1). We set 7 = 5. 



B. Eigen model with quadratic replication rate 
without degradation 

We apply our results to model qualitatively the inter- 
action of a virus with a drug. In some situations, one 
can describe the action of a drug against the virus sim- 
ply as a one peak Eigen model: that is, replication rate 
is a function of the Hamming distance from one peak. 
The virus may increase its mutation rate, and at some 
mutation rate there is an error catastrophe [l6| . Let us 
define the critical 7 for the replication rate function 

f Q (M) = kM 2 /2 + l • (29) 

According to our analytical solution, Eq. (|26|) . we con- 
sider the maximum of the mean excess replication rate 
per site expression 

fo(M) exp[- 7 (l - v 7 ! - M 2 )] . (30) 

The error catastrophe occurs and leads to a phase with 
71/ = when k < 7. The error threshold for this 
quadratic case is the same as in the case of the Crow- 
Kimura model Eq. Q). Numerical solution is in agree- 
ment with our analytical formulas, as shown in Table ITT1 



C. Simple formulas for two peak case 

In the two peak, K — 2, case we can define the 
from Eq. l|28[l from the system 

1 + m , , 1 — m . 
Mi = — - — (mi + m 2 ) H — (mi - m 2 ) 

M 2 = — - — (toi+to 2 ) — (mi-m 2 ) , (31) 

where we have defined mi , 777 2 in terms of the rrii from Eq. 

()28J) by m ++ — — to__ = mi +m 2 and m_| = — m y. = 

mi — m 2 , We have for the mean excess replication rate 
per site 



-^^Vi - (^i + + my 

-i^Vl - (Mi - M 2 )2/(l - m)A 
-d (M 1 ,M 2 ). (32) 

IV. BIOLOGICAL APPLICATIONS 

The Eigen model is commonly used to consider virus 
or cancer evolution. We here consider an evolving virus 
or cancer and its control by a drug or the immune sys- 
tem, using the K = 2, two-peak version of the Eigen 
model. To model this situation, we consider there to 
be an optimal genome for virus replication, and we con- 
sider the replication rate function /o(Mi, M 2 ) to depend 
only on the Hamming distance of the virus or cancer 
from this preferred genome, 7V(1 — Mi)/2. Conversely, 
there is another point in genome space that the drug or 
immune suppresses most strongly, and we consider the 
degradation rate function do (Mi, M 2 ) to depend only on 
the Hamming distance from this point, N(l — M 2 )/2. 
While each of the functions fo and d depend only on 
one of the two distances, this is multiple-peak problem, 
because both distances are needed to describe the evolu- 
tion of the system. 

A. Interaction of virus with a drug 

We first consider a virus interacting with a drug. We 
model this situation by the Eigen model with one peak in 
the replication rate function and one peak in the degra- 
dation rate function. The virus replicates most quickly 
at one point in genome space, with the rate at all other 
points given by a function that depends on the Hamming 
distance from this one point. That is, in Eq. l.'il'l we have 

{A, Mi = 1 
(33) 
1, Mi < 1 

At another point in genome space, a drug suppresses the 
virus most strongly. We consider the case of exponen- 
tial degradation, a generic and prototypical example of 
recognition [T^ : 

d (Mi,M 2 ) = e- b ^- M ^ . (34) 

Applying the multiple-peak formalism, we find two 
phases. There is a selected, ferromagnetic (FM) phase 
with Mi = 1, M 2 = to and mean excess replication rate 
per site 

f m = Ae 1 - exp(-6(l - m)) . (35) 



fm = /o(M 1 ,M 2 )exp 



There is also a non-selective NS phase, with Mi < 1. 
The values of Mi and M 2 in the NS phase are those 
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FIG. 1: A phase diagram for the interaction of virus and 
drug, according to the narrow replication advantage model, 
Eqs. I|34|l and (J2HI- We set b = 3.5 in the exponential degra- 
dation function Eq. (I34L and 7=1. As the point in genome 
space at which the drug is most effective moves toward the 
point in genome space at which the virus grows most rapidly, 
m — ► 1, a higher replication rate of the virus is required for 
the virus to survive. In the NS phase, the drug eliminates the 
virus. In inset is shown the phase diagram for the interac- 
tion of an adaptable virus and a drug, according to the flat 
peak replication advantage model, Eqs. (1341 and 1361 . We use 
Mo = 0.9 to represent a broad peak for the virus replication 
rate. 



which maximize Eg. 13 21 given the constraints of Eq. (|12fl . 
The error threshold corresponds to the situation when 
the mean excess replication rate of the FM and NS phases 
are equal. The phase diagram as a function of the opti- 
mal replication rate of the virus and the distance between 
the points of optimal virus growth and optimal virus sup- 
pression is shown in Fig. ^ The optimal replication rate 
is A, and the distance between the points of optimal virus 
growth and optimal virus suppression is N — I, where the 
parameter m is defined as m = (21 — N)/N. As the 
point in genome space at which the drug is most effec- 
tive moves toward the point in genome space at which 
the virus grows most rapidly, the virus is more read- 
ily eradicated. Alternatively, one can say that as the 
point in genome space at which the drug is most effec- 
tive moves toward the point in genome space at which 
the virus grows most rapidly, a higher replication rate of 
the virus is required for its survival. 



B. Interaction of an adaptable virus with a drug 



We now consider a virus that replicates with rate A 
when the genome is within a given Hamming distance 
from the optimal genome and with rate 1 otherwise. That 



is, in Eq. (|32(l we have 



/ (M 1 ,M 2 ) = 



A, M < Mi < 1 



1, -1 < Mi < Mr, 



(36) 



where Mq > and close to one. We suppression of the 
virus by the drug as expressed in Eq. (|34|) . 

There is again a ferromagnetic (FM) phase with a suc- 
cessful selection. In the FM phase, one has Mo < Mi < 
1 . The evolved values of Mi and M2 maximize 



Aexp 
1 + m 

1 — m 



7 1 



VI - (Mi + M 2 ) 2 /(l + m) i 
y/l - (Mi - M 2 )7(l - mf 



-d (M 1 ,M 2 ) 



(37) 



There is also a NS phase where the virus has been driven 
off its advantaged peak, Mi < Mo- In this case, one seeks 
a maximum of Eq. I|32(l with /o = 1 via Mi and M 2 in 
the range — 1 < Mi < Mq, — 1 < M 2 < 1, subject to the 
constraints of Eq. I|12|) . 

A phase diagram for this case is shown in the inset 
to Fig. ^ The broader range of the virus fitness land- 
scape allows the virus to survive under a greater drug 
pressure in model Eq. i|36[l versus Eq. (|33|) . That is, as 
the point in genome space at which the drug is most ef- 
fective moves toward the point in genome space at which 
the virus grows most rapidly, the adaptable virus is still 
able to persist due to the greater range of genotype space 
available in the FM phase. For such an adaptable virus, a 
more specific, multi-drug cocktail might be required for 
eradication. A multi-drug cocktail provides more sup- 
pression in a broader range of genome space, so that the 
adaptable drug may be eradicated under a broader range 
of conditions. 



C. Original antigenic sin 

The immune system acts much like a drug, as a nat- 
ural protection against death by infection. Prior expo- 
sure such as vaccination typically increases the immune 
control of a virus. In some cases, the immune control 
of a virus is non-monotonic in the distance between the 
vaccine and the virus [13j. This phenomenon is termed 
original antigenic sin. To model original antigenic sin, 
we consider a non-monotonic degradation function, cen- 
tered around the second peak, which represents the non- 
monotonic behavior of the binding constant, as in our 
previous model 0} ■ We fit the binding constant data 
to a sixth order polynomial in p, where p = (1 — M 2 )/2 is 
the relative distance between the recognition of the anti- 
body and the virus. The degradation function is shown 



7 



< 30 




Q\ , 1 , 1 , 1 , 1 

> -1 -0.5 0.5 1 



m 

FIG. 2: A phase diagram corresponding to the original anti- 
genic sin model is shown. The degradation function (shown 
in inset) is chosen to closely reproduce the binding con- 
stant behavior in [l3j. with a limiting degradation rate of 
do(Mi, M2 — —1) = 1. We use 7=1. The virus replication 
advantage required to escape immune system control is a non- 
monotonic function of the Hamming distance, N(m — l)/2, 
between the point in genome space at which the immune sys- 
tem is most effective and the point in genome space at which 
the virus grows most rapidly. 



in inset in Fig. [21 We consider a single peak virus repli- 
cation rate, Eq. 1(55)1, 

There is an interesting phase structure as a function 
of m. From Eq. (|32|) . we have a FM phase with Mi = 1, 
M 2 = m. We also have a non-selective NS phase, with 
M\ < 1, where Mi, Mi are determined by maximization 
of Eq. with f = 1 under the constraints of Eq. (|T2|l . 
The phase diagram for typical parameters 0, is shown 
in Fig. [5] A continuous phase boundary is observed be- 
tween the FM and NS phases. The virus replication rate 
required to escape eradication by the adaptive immune 
system depends on how similar the virus and the vaccine 
are. When the vaccine is similar to the virus, m near 1, 
a large virus replication rate is required to escape erad- 
ication. This result indicates the typical usefulness of 
vaccines in protection against and eradication of viruses. 
When the vaccine is not similar to the virus, m < 0, the 
vaccine is not effective, and only a typical virus replica- 
tion rate is required. 

When the vaccine is somewhat similar to, but not iden- 
tical to, the virus, the replication rate required for virus 
survival is non-monotonic. This result is due to the non- 
monotonic degradation rate around the vaccine degrada- 
tion peak. The minimum in the required virus replica- 
tion rate, m 0.30, corresponds to the minimum in the 
degradation rate, M2 ~ 0.30. The competition between 
the immune system, vaccine, and virus results in a non- 
trivial phase transition for the eradication of the virus. 



D. Tumor control and proliferation 

We consider cancer to be a mutating, replicating ob- 
ject, with a flat replication rate around the first peak, 
Eq. i|3b|) and Eq. i|32|) . We consider the immune sys- 
tem to be able to eradicate the cancer when the cancer 
is sufficiently different from self. Thus, the T cells have 
a constant degradation rate everywhere except near the 
self, represented by the second peak: 

( B, -1 < M 2 < M h 
d (M 1 ,M 2 ) = l ■ (38) 

[ 0, M b <M 2 <1 

To be consistent with the biology, we assume M& > 0. We 
also assume Mo > 1/2. Typically, also, the Hamming 
distance between the cancer and the self will be small, 
m will be positive and near unity, although we do not 
assume this. 

There are four possible selective, ferromagnetic phases. 
We find the phase boundaries analytically, as a function 
of m = (21 - N)/N. For mM < M b , there is a FM4 
phase with Mi = Mo and M 2 = tuMq. The mean ex- 
cess replication rate per site is f m = Ae 7 *V^o _1 ) — B. 
There is a FM3 phase with Mi = M and M 2 = M b . 
The mean excess replication rate per site is f m — 
^ e 7(V( 1 + m ) 2 -( M o+ M b) 2 +V( 1 - m ) 2 -( M o- M ") 2 - 2 )/ 2 . This 
phase is chosen over the FM4 phase when the mean ex- 
cess replication rate is greater. For tuMq > Mb there is 
a FM2 phase with Mi = M and M 2 = mM . The mean 

excess replication rate per site is f m = Ae 1 ^ l ~ M v~ 1 \ 
For mMb > M there is a FM1 phase with Mi = mM b 
and M 2 = M b . The mean excess replication rate per site 
isf m = Ae^V^-V. 

There are two non-selective phases. There is a NS1 
phase with M\ = mMb and M 2 = Mb- The mean excess 

replication rate per site is f m = e 7( -^/ 1 ~ A/ ? _1 ' . There 
is a NS2 phase with Mi = M 2 = 0. The mean excess 
replication rate per site is f m = 1 — B. 

In Fig. is shown the phase diagram for cancer pro- 
liferation. According to our previous model 0, |u| , we 
choose (1 - M b )/2 = 0.23. We choose M = 0.9 for the 
width of the advantaged cancer phase. We choose the im- 
mune suppression rate as B = 1 . As the cancer becomes 
more similar to the self, the immune control becomes less 
effective, and the replication rate required for the cancer 
to proliferate becomes less. Three of the four selective 
and one of the two non-selective phases are present for 
this set of parameters. 



V. DISCUSSION AND CONCLUSION 

We have used the Eigen model to consider the inter- 
action of a virus or cancer with a drug or the immune 
system. One can also use the parallel model to represent 
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ent stages, or grades, through which tumors typically 
progress. 

Our results are a first step toward making the connec- 
tion with evolution on rugged fitness landscapes, land- 
scapes widely accepted to be accurate depictions of na- 
ture. We have applied our solution of these microscopic 
complex adaptive systems to model four situations in bi- 
ology: how a virus interacts with drug, how an adapt- 
able virus interacts with a drug, the problem of original 
antigenic sin |13| . and immune system control of a pro- 
liferating cancer. 
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Appendix 



the replication dynamics of the virus or the cancer. This 
would be an interesting application of our formalism. 

Another application of the formalism would be to con- 
sider explicitly the degradation induced by multi-drug 
cocktails. That is, one would consider one peak to repre- 
sent the preferred virus genome and K — 1 degradation 
peaks to represent the K — 1 drugs. We note that in the 
general case, the yi parameters depend on the explicit 
location of the drug degradation peaks, not simply the 
distance between them. Results from this application of 
the formalism could be quite illuminating as regards the 
evolution of multi-drug resistance. 

In conclusion, we have solved two common evolution 
models with general fitness, or replication and degrada- 
tion rate, landscapes that depend on the Hamming dis- 
tances from several fitness peaks. Why is this important? 
First, we have solved the microscopic models, rather than 
assuming a phenomenological macroscopic model. As 
is known in statistical mechanics, a phenomenological 
model may not always detect the fine structure of criti- 
cal phenomena. Second, approximate or numerical solu- 
tions, while useful, do not always explicitly demonstrate 
the essence of the phenomenon. With analytical solu- 
tions, the essence of phenomenon is transparent. Third, 
we have derived the first path integral formulation of the 
Eigen model. This formulation may prove useful in other 
studies of this model of molecular evolution. 

Our results for cancer are a case in point. There are 
four stable selective phases and two stable non-selective 
phases. These results may help to shed light on the, 
at present, poorly understood phenomena of interaction 
with the immune system, and on why the immune re- 
sponse to cancer and to viruses differs in important ways. 
These phases could well also be related to the differ- 



In this section, we derive the path integral represen- 
tation for the solution to the Eigen model. For simplic- 
ity we show the derivation for the K = 1 case. To our 
knowledge, this is the first path integral expression rep- 
resentation of the solution to the Eigen model. This path 
integral representation allows us to make strong analytic 
progress. We start from the quantum representation of 
the Eigen model 0, ^| . The Hamiltonian is given by 



1=0 



l<i 1 <i 2 <.--<ii<N 



xf (a z )-Nd {a z ) 
« iVe-V^E^? f (a z ) - Nd (a*) , (39) 

where we have used the fact that with j/N small, we 
need to consider only I <C N spin flips. The partition 
function is decomposed by a Trotter factorization: 

Z = Tre-' m 

= Tr{S 1 \e-f )H ' L \S 2 ){S 2 \e-f m ' L \S 3 ) . . . 

*{S L \e-W L \Si) . (40) 



Here 



(S^e-^^lSt) 

(0N/L) 



= (Si-i\e 



(3N 



s, 



d (Ys l n /N) 



15V 



(41) 
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We use the notation M t = Y^, n S n/N ■ We find 
f3N 



L 

{Si-i\Si) 



0N 



d Q (Mi 



/o(M,)e 



(42) 



where e 23 — j/N. We thus find 



a; = A(dj) 



1 



L 



-d (Mi 



+ ^-e^f (Mi)e Bd ' ,(43) 



where di = J2 n ( s n ls n ~ !)• To represent this in path 
integral form, we consider 

1 



(27T)* 



dhdmdipe 



&tNe~~< f (Mt)e Bm e-<f> m -AtNd (M t ) 



xe 
1 

2^ 



ipd—h(m—d) 



dhdm 



xe Sm 5(m-d) 



= / <im 



6(d)5(m - d)e~ AtNdo(M ' ) + AtNe^f (Mi) 

xe Bm S(m~d)6(m-d) 

= S(d)e- AtNd "^ + Me- 1 / o (M l )e B,i {(0) 

= (5(0) [A(d)e- AtAr<io(M() + AiiYe- 7 /o(Af ; )e S<i j ,(44) 

where At = (3/L. We note that, had we used a Fourier 
representation of the delta function on the finite domain 
[— A/2, A/2), instead of the infinite domain (— oo,oo), 
the expression 2ttS(0) simply becomes A; moreover such 
a finite representation of the delta function is a suffi- 
ciently accurate representation of the A(di) constraint 
when A N. Ignoring the constant prefactor 5(0) 
terms, we can write the full partition function as 



Z = Tr J VipVhVm 



xe 



J2 l i>ldl+(0N/L)J2il- h l m l+ h l< 1 '- 



(45) 



We now introduce the integral representation of the con- 
straint 5[((3/L)(NMt - J2n sl n)]- After reseating Bm t -> 
mi, hi Bhi we find 

Z = Tr J V^VhVmVHVMe i0N/L) ^i [e ^ MMl)em ' 

xe e-1'i m i/ B -do(M l )-h l 7n l -H l M l ] 

xe (p/ L )T, l H ^ n ^+J2 l ^+' 3NBh ^ L ^J2J s ^ ls ^- 1 ) . 

(46) 



xe 



We note by an expansion of the 
exp[((3N/L)e-y {Mi)e m ' exp(-^mj/B)] = 
ET t = [^ N / L >~yo(Mi)e m '} k 'eM-ki^irni/B)/ki\ 
term in Eq. (|46|1 to first order in (3N/ L that the integral 
over gives nothing more than 5(—kimi/B + di) for 
h = 0, 1. This condition, however, is already enforced by 
the hi field when fc; = 1 and by the mi field when ki = 
if we take as a rule to disallow mutations when hi = 0. 
We can, thus, remove the integral over ip, removing the 
5(0) that we anticipated, to find 

Z = [ VhVmVHVMe {pN/L) ^'' MMl)emi - MMl) 

-hi mi -HiMi\Q i j- 47 ) 

where 

Q = Tre (J3/L) ^i Hl £» sl n HPNB/L) h, Ej^L-i)^ 

(48) 

Here Q is the partition function of N 1-d Ising 
models of length L. Here F enforces the con- 
straint of disallowing mutations when hi = 0: F = 
nf =1 A{A(hi)[J2 n (s l ~ 1 s l n -l)}}. We note that Q = Qf , 
where Qi is the partition function of one of these models. 
We are not, at this point, allowed to assume that the Hi 
or hi fields are constant over I. Indeed, by Taylor series 
expanding the first term in Eq. 1471 and integrating over 
to/, we find that hi — or hi = L/(f3N). For hi = 0, 
we disallow mutations, as formalized by F . Thus, we can 
replace e~ 2B by jti/N, where ti — if hi — 0, and ti = 1 
if hi — L/((3N). We evaluate the partition function Qi 
with an ordered product of transfer matrices. To first or- 
der in (3/ L the matrix at position I is given by 7] = J + e; 
where 



(Oil 




L 


N 


7*! 


mi 


N 


L 


0Hl 




L 


L 


0-yh, 


§Hl 


L 


L 



(49) 



We find 



Qi = Tr\{Ti 

i 



(50) 



We rescale h — > /1/7 and to — > 7717 and take the continu- 
ous limit to find 



(51) 



where the operator f indicates (reverse) time ordering, 
and /3' = j3(L — l)/L. We find the form of the partition 
function to be 

Z = VhVmVHVM 
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N f dp'[e—t f (M)e~ 1m ~d (M)-hm-HM]+NlnQ 1 



(52) 



-(i? 2 



2N1/2 



(54) 



Noting the N prefacing the entire term in 
the exponential, we take the saddle point. 
We note th at 5Q 1 /5H(f3')\ H (/3>)=H,h(P')=h = 
((3H/VH 2 + /i 2 )2sinh(/V# 2 + h 2 ) and 

5Qi/5 h(/3')\ H (p >)=H,h(i3')=h = 

(fth/VH 2 + h 2 )2sinh((3VH 2 + h 2 ). We, thus, find 
a solution of the saddle point condition to be fields 
H, M, h, m independent of (3 that maximize 



One can also derive Eq. (|54|l by means of a series expan- 
sion in (3, a "high temperature" expansion. 



N M 



h{M)e 



d (M) — hra — HM] 



+ ln[2 cosh(/ViT 2 + h 2 ) 



(53) 



when the fields are averaged over a range A/3 = 0(l/N) 
by the saddle point limit. In the limit of large (3, we find 



—— — max 

(3N M,H,m,h 



fo(M)e-~<e< m ~ d (M) - hm - HM 



The generalization of the path integral representation 
to the multiple-peak Eigen case proceeds as in the parallel 
case. One introduces K fields for the magnetizations, 
M , and K fields enforcing the constraint, H k . One 
also finds in the linear field part of the Ising model the 

sum SfcLi H i J2 n v n s n instead of simply Hi J2 n s l n - Tlic 
definition of the yi and the atk allows one to rewrite this 
in the form that leads to Eq. Q) or (J23J. 
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